BIJUNG:13.1.2 Limits in Sparse-Reward Environments: The Failure of Simple Exploration (Epsilon-greedy) and the Need for Structured Exploration